{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7c249b40",
   "metadata": {},
   "source": [
    "# Using Azure OpenAI\n",
    "\n",
    "This tutorial will show you how to use Azure OpenAI endpoints instead of OpenAI endpoints.\n",
    "\n",
    "\n",
    "- [Evaluation](#load-sample-dataset)\n",
    "- [Test set generation](#test-set-generation)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e63f667",
   "metadata": {},
   "source": [
    ":::{Note}\n",
    "this guide is for folks who are using the Azure OpenAI endpoints. Check the [evaluation guide](../../getstarted/evaluation.md) if your using OpenAI endpoints.\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e54b5e01",
   "metadata": {},
   "source": [
    "### Load sample dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b658e02f",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Found cached dataset fiqa (/home/jjmachan/.cache/huggingface/datasets/explodinggradients___fiqa/ragas_eval/1.0.0/3dc7b639f5b4b16509a3299a2ceb78bf5fe98ee6b5fee25e7d5e4d290c88efb8)\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8cfcf43797d746c6a35d2c9eb9512abc",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/1 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "DatasetDict({\n",
       "    baseline: Dataset({\n",
       "        features: ['question', 'ground_truth', 'answer', 'contexts'],\n",
       "        num_rows: 30\n",
       "    })\n",
       "})"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# data\n",
    "from datasets import load_dataset\n",
    "\n",
    "amnesty_qa = load_dataset(\"explodinggradients/amnesty_qa\", \"english_v2\")\n",
    "amnesty_qa"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4b8a69c",
   "metadata": {},
   "source": [
    "Lets import metrics that we are going to use. To learn more about what each metrics do, check out this [doc](https://docs.ragas.io/en/latest/concepts/metrics/index.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "f17bcf9d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from ragas.metrics import (\n",
    "    context_precision,\n",
    "    answer_relevancy,\n",
    "    faithfulness,\n",
    "    context_recall,\n",
    ")\n",
    "from ragas.metrics.critique import harmfulness\n",
    "\n",
    "# list of metrics we're going to use\n",
    "metrics = [\n",
    "    faithfulness,\n",
    "    answer_relevancy,\n",
    "    context_recall,\n",
    "    context_precision,\n",
    "    harmfulness,\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c77789bb",
   "metadata": {},
   "source": [
    "### Configuring them for Azure OpenAI endpoints\n",
    "\n",
    "Ragas also uses AzureOpenAI for running some metrics so make sure you have your Azure OpenAI key, base URL and other information available in your environment. You can check the [langchain docs](https://python.langchain.com/docs/integrations/llms/azure_openai) or the [Azure docs](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints) for more information.\n",
    "\n",
    "\n",
    "But basically you need the following information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "68c6f3be-7935-401a-abdf-5eab73d7fe41",
   "metadata": {},
   "outputs": [],
   "source": [
    "azure_configs = {\n",
    "    \"base_url\": \"https://<your-endpoint>.openai.azure.com/\",\n",
    "    \"model_deployment\": \"your-deployment-name\",\n",
    "    \"model_name\": \"your-model-name\",\n",
    "    \"embedding_deployment\": \"your-deployment-name\",\n",
    "    \"embedding_name\": \"text-embedding-ada-002\",  # most likely\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be0540bb-98c5-4bc9-89dc-0ee10a330e2c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# assuming you already have you key available via your environment variable. If not use this\n",
    "# os.environ[\"AZURE_OPENAI_API_KEY\"] = \"...\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36bb5f1e-14bd-4648-a6e2-72ff980550e0",
   "metadata": {},
   "source": [
    "Now lets create the chat model and embedding model instances so that ragas can use it for evaluation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "50110d32-8ac7-47ae-a75f-53f4dee694e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_openai.chat_models import AzureChatOpenAI\n",
    "from langchain_openai.embeddings import AzureOpenAIEmbeddings\n",
    "from ragas import evaluate\n",
    "\n",
    "azure_model = AzureChatOpenAI(\n",
    "    openai_api_version=\"2023-05-15\",\n",
    "    azure_endpoint=azure_configs[\"base_url\"],\n",
    "    azure_deployment=azure_configs[\"model_deployment\"],\n",
    "    model=azure_configs[\"model_name\"],\n",
    "    validate_base_url=False,\n",
    ")\n",
    "\n",
    "# init the embeddings for answer_relevancy, answer_correctness and answer_similarity\n",
    "azure_embeddings = AzureOpenAIEmbeddings(\n",
    "    openai_api_version=\"2023-05-15\",\n",
    "    azure_endpoint=azure_configs[\"base_url\"],\n",
    "    azure_deployment=azure_configs[\"embedding_deployment\"],\n",
    "    model=azure_configs[\"embedding_name\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44641e41",
   "metadata": {},
   "source": [
    "In case of any doubts on how to configure the Azure endpont through langchain do reffer to the [AzureChatOpenai](https://python.langchain.com/docs/integrations/chat/azure_chat_openai) and [AzureOpenAIEmbeddings](https://python.langchain.com/docs/integrations/text_embedding/azureopenai) documentations from the langchain docs."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d6ecd5a",
   "metadata": {},
   "source": [
    "### Evaluation\n",
    "\n",
    "Running the evalutation is as simple as calling evaluate on the `Dataset` with the metrics of your choice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "22eb6f97",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e87bf94ac59448c48faf5bcc03667522",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Evaluating:   0%|          | 0/150 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "{'faithfulness': 0.7083, 'answer_relevancy': 0.9416, 'context_recall': 0.7762, 'context_precision': 0.8000, 'harmfulness': 0.0000}"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = evaluate(\n",
    "    amnesty_qa[\"eval\"], metrics=metrics, llm=azure_model, embeddings=azure_embeddings\n",
    ")\n",
    "\n",
    "result"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a2dc0ec2",
   "metadata": {},
   "source": [
    "and there you have the it, all the scores you need.\n",
    "\n",
    "now if we want to dig into the results and figure out examples where your pipeline performed worse or really good you can easily convert it into a pandas array and use your standard analytics tools too!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "8686bf53",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>question</th>\n",
       "      <th>ground_truth</th>\n",
       "      <th>answer</th>\n",
       "      <th>contexts</th>\n",
       "      <th>faithfulness</th>\n",
       "      <th>answer_relevancy</th>\n",
       "      <th>context_recall</th>\n",
       "      <th>context_precision</th>\n",
       "      <th>harmfulness</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>How to deposit a cheque issued to an associate...</td>\n",
       "      <td>[Have the check reissued to the proper payee.J...</td>\n",
       "      <td>\\nThe best way to deposit a cheque issued to a...</td>\n",
       "      <td>[Just have the associate sign the back and the...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.982491</td>\n",
       "      <td>0.888889</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Can I send a money order from USPS as a business?</td>\n",
       "      <td>[Sure you can.  You can fill in whatever you w...</td>\n",
       "      <td>\\nYes, you can send a money order from USPS as...</td>\n",
       "      <td>[Sure you can.  You can fill in whatever you w...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.995249</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1 EIN doing business under multiple business n...</td>\n",
       "      <td>[You're confusing a lot of things here. Compan...</td>\n",
       "      <td>\\nYes, it is possible to have one EIN doing bu...</td>\n",
       "      <td>[You're confusing a lot of things here. Compan...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.948876</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Applying for and receiving business credit</td>\n",
       "      <td>[\"I'm afraid the great myth of limited liabili...</td>\n",
       "      <td>\\nApplying for and receiving business credit c...</td>\n",
       "      <td>[Set up a meeting with the bank that handles y...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.813285</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>401k Transfer After Business Closure</td>\n",
       "      <td>[You should probably consult an attorney. Howe...</td>\n",
       "      <td>\\nIf your employer has closed and you need to ...</td>\n",
       "      <td>[The time horizon for your 401K/IRA is essenti...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.894836</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                            question  \\\n",
       "0  How to deposit a cheque issued to an associate...   \n",
       "1  Can I send a money order from USPS as a business?   \n",
       "2  1 EIN doing business under multiple business n...   \n",
       "3         Applying for and receiving business credit   \n",
       "4               401k Transfer After Business Closure   \n",
       "\n",
       "                                       ground_truth  \\\n",
       "0  [Have the check reissued to the proper payee.J...   \n",
       "1  [Sure you can.  You can fill in whatever you w...   \n",
       "2  [You're confusing a lot of things here. Compan...   \n",
       "3  [\"I'm afraid the great myth of limited liabili...   \n",
       "4  [You should probably consult an attorney. Howe...   \n",
       "\n",
       "                                              answer  \\\n",
       "0  \\nThe best way to deposit a cheque issued to a...   \n",
       "1  \\nYes, you can send a money order from USPS as...   \n",
       "2  \\nYes, it is possible to have one EIN doing bu...   \n",
       "3  \\nApplying for and receiving business credit c...   \n",
       "4  \\nIf your employer has closed and you need to ...   \n",
       "\n",
       "                                            contexts  faithfulness  \\\n",
       "0  [Just have the associate sign the back and the...           1.0   \n",
       "1  [Sure you can.  You can fill in whatever you w...           1.0   \n",
       "2  [You're confusing a lot of things here. Compan...           1.0   \n",
       "3  [Set up a meeting with the bank that handles y...           1.0   \n",
       "4  [The time horizon for your 401K/IRA is essenti...           0.0   \n",
       "\n",
       "   answer_relevancy  context_recall  context_precision  harmfulness  \n",
       "0          0.982491        0.888889                1.0            0  \n",
       "1          0.995249        1.000000                1.0            0  \n",
       "2          0.948876        1.000000                1.0            0  \n",
       "3          0.813285        1.000000                1.0            0  \n",
       "4          0.894836        0.000000                0.0            0  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = result.to_pandas()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f668fce1",
   "metadata": {},
   "source": [
    "And thats it!\n",
    "\n",
    "if you have any suggestion/feedbacks/things your not happy about, please do share it in the [issue section](https://github.com/explodinggradients/ragas/issues). We love hearing from you 😁"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3cee41e9",
   "metadata": {},
   "source": [
    "### Test set generation\n",
    "\n",
    "Here you will learn how to generate a test set from your dataset using the Azure OpenAI endpoints."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa9ff398",
   "metadata": {},
   "outputs": [],
   "source": [
    "! git clone https://huggingface.co/datasets/explodinggradients/2023-llm-papers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d935a561",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders import DirectoryLoader\n",
    "from ragas.testset.generator import TestsetGenerator\n",
    "from ragas.testset.evolutions import simple, reasoning, multi_context\n",
    "\n",
    "\n",
    "loader = DirectoryLoader(\n",
    "    \"./2023-llm-papers/\", use_multithreading=True, silent_errors=True, sample_size=1\n",
    ")\n",
    "documents = loader.load()\n",
    "\n",
    "for document in documents:\n",
    "    document.metadata[\"filename\"] = document.metadata[\"source\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8f735a7",
   "metadata": {},
   "source": [
    "Use the `azure_model` and `azure_embedding` that we initialized in above section to generate the test set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04abc4b1",
   "metadata": {},
   "outputs": [],
   "source": [
    "generator = TestsetGenerator.from_langchain(\n",
    "    generator_llm=azure_model, critic_llm=azure_model, embeddings=azure_embeddings\n",
    ")\n",
    "\n",
    "testset = generator.generate_with_langchain_docs(\n",
    "    documents,\n",
    "    test_size=10,\n",
    "    raise_exceptions=False,\n",
    "    with_debugging_logs=False,\n",
    "    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2f5a7f7",
   "metadata": {},
   "source": [
    "testset.to_pandas()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
